BMC Medical Genomics
Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match BMC Medical Genomics's content profile, based on 12 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.
Orkild, M. R.; Dybdahl, K. L.; Duun Rohde, P. D.
Inflammatory bowel disease (IBD) frequently co-occurs with immune-mediated and metabolic disorders, but whether these associations reflect shared genetics or causal effects remains unclear. We performed two-sample Mendelian randomization (MR) using large-scale genome-wide association study (GWAS) summary statistics to investigate potential causal effects of immune-mediated diseases and lifestyle traits on IBD, Crohn's disease (CD), and ulcerative colitis (UC). SNP-based heritability and genetic correlations were estimated to contextualize findings. Following false discovery rate correction, genetically predicted psoriasis was positively associated with IBD (OR 1.15), CD (OR 1.23), and UC (OR 1.10), with the strongest effect observed for CD. Genetically predicted type 2 diabetes mellitus (T2DM) showed a modest inverse association with UC (OR 0.88). No lifestyle-related traits remained significant after correction. Sensitivity analyses indicated heterogeneity across instruments and evidence of directional pleiotropy in selected models, whereas no pleiotropy was detected for the T2DM-UC association. These findings support a role of psoriasis-related immune pathways in IBD susceptibility and suggest a potential inverse association between genetic liability to T2DM and UC.
Li, Y.; Cornejo-Sanchez, D. M.; Dong, R.; Naderi, E.; Wang, G. T.; Leal, S. M.; DeWan, A. T.
The genetic relationship between asthma and lung function may be dependent on age-at-onset (AAO) of asthma. We investigated whether the shared genetics between asthma and lung function depends on AAO. Asthma cases from UK Biobank were subset according to their AAO and genetic correlation was used to obtain genetically homogeneous groups, i.e., ≤20 (LT20), 20-40, and >40 (GT40) years. Association analysis and fine-mapping were performed to identify shared genetics between AAO groups and lung function. Mediation and quantitative trait locus (QTL) analyses were performed to identify mechanisms underlying shared genetic associations. Chr5, chr6, chr12, and chr17 each had one region that displayed a cross-phenotype replicated association with at least one AAO group and lung function. Overlapping credible sets obtained from fine-mapping were observed on chr5 and chr6. Mediation analyses demonstrated that for each region the proportion mediated through asthma on lung function was larger for asthma LT20 compared to 20-40 and GT40, suggesting that their effects on lung function were more strongly driven by this association. Tissue-specific QTL analysis revealed that shared etiology on chr5 may be acting through SLC22A5 and C5orf56, which might play an important role in decreased lung function among individuals with earlier-onset asthma.
Cremin, C.; Elavalli, S.; Paulin, L.; Arres Reche, J.; Saad, A. A. Y. A.; Attia, A.; Minas, C.; Aldhuhoori, F.; Katagi, G.; Wu, H.; Sidahmed, H.; Mafofo, J.; Soliman, O.; Behl, S.; Pariyachery, S.; Gupta, V.; Ghanem, D.; Sajjad, H.; Cardoso, T.; El-Khani, A.; Al Marzooqi, F.; Magalhaes, T.; Sedlazeck, F. J.; Quilez, J.
Background: The hyperpolymorphic nature and structural complexity of the human leukocyte antigen (HLA) genomic region present challenges for accurate and scalable typing across diverse sample types. While whole-genome sequencing (WGS) offers the opportunity to infer HLA genotypes without targeted enrichment, systematic benchmarks across sequencing platforms, biospecimens and coverage levels remain limited. Results: We assembled a multi-platform resource of WGS datasets derived from short-read (Illumina, MGI) and long-read (Oxford Nanopore Technologies R9 and R10) sequencing, spanning 29 biospecimens including cell lines, blood, buccal swab and saliva. We evaluated the performance of the HLA caller HLA*LA across 13 HLA genes, using a clinically validated assay as reference. WGS-based HLA genotyping achieved ~95% accuracy across sequencing platforms, with Class I loci exhibiting higher accuracy than Class II. Cross-platform concordance was high, and performance remained consistent across Illumina, MGI and Oxford Nanopore chemistries. Analysis of blood, buccal swab and saliva samples showed that blood and buccal swabs supported accurate HLA inference, whereas saliva yielded reduced concordance. Downsampling experiments demonstrated that 15x coverage was sufficient to retain >95% accuracy at two-field resolution, with lower depths supporting lower-resolution typing. Conclusions: Our results demonstrate that WGS provides a robust, platform-agnostic framework for accurate HLA genotyping across sample types and coverage levels. These benchmarks establish practical conditions for reliable HLA inference and underscore the utility of WGS for population-scale HLA analyses and future clinical applications.
Pehova, Y.; Apella, S.; Kolobkov, D.; Malinowski, A. R.; Pawlowski, M.; Strivens, M. A.; Sardell, J.; Gardner, S.
Background: Hypertension affects over 30% of adults and is the leading risk factor for cardiovascular disease. It often presents without obvious symptoms, meaning that, although effective therapies exist, hypertension remains widely undiagnosed and insufficiently treated. Genomics-based prediction methods have shown only modest benefits for these disorders, but proteomic markers have demonstrated potential for greater predictive and clinical value. Methods: We applied a novel machine-learning based patient stratification analysis pipeline to proteomics data for 7,086 hypertension patients from UK Biobank's Pharma Proteomics Project cohort (2,911 proteins). We evaluated the contribution of each protein to the output of a tree-based risk model to explore the combinations of protein expression values that naturally separate hypertension cases into clusters and assessed the prevalence of cardiovascular and renal complications within each obtained cluster. Results: We identified 10 clusters of hypertension patients segregated by differential expression of HAVCR1, PLAT, PTPRB, REN and RTN4R. Four of these clusters showed statistically significant enrichment for cardiovascular and renal complications, and three of them had significantly lower prevalence of complications than expected among hypertension patients. Conclusion: We hypothesize that the hypertension clusters identified may represent distinct mechanistic subtypes. With further study this could help focus studies on subgroups of hypertension patients with a shared disease etiology, identify more personalized precision medicine treatment options for each subgroup, and develop mechanism-based biomarker tests to support enriched clinical trial recruitment.
Joof, E.; Hernandez-Beeftink, T.; Parcesepe, G.; Massen, G. M.; Nabunje, R.; Power, H. J.; Woodward, R.; Altunusi, F.; Leavy, O. C.; Longhurst, H. J.; Jenkins, R. G.; Quint, J. K.; Wain, L. V.; Allen, R. J.
Introduction: Fibrosis can affect organs throughout the body and is present in a wide range of diseases. Recent research has suggested that there could be shared biological mechanisms that lead to fibrosis in different organs. Methods: We performed genome-wide association studies using UK Biobank for fibrosis in 12 different organ-systems and meta-analysed results with previously published studies of fibrotic diseases. We considered genetic associations that colocalised across ≥3 organs as those likely to be involved in general fibrotic mechanisms and also identified novel genetic variants not previously reported as associated with fibrosis. Genetic correlation of fibrosis between organs was calculated using linkage disequilibrium score regression (LDSC). Discovery analyses were performed using European ancestry individuals and results were tested further in African, South Asian and East Asian ancestry groups. Results: We identified eight genetic loci that colocalised across three or more organs. One of these signals, located near the SH2B3 and ATXN2 genes, showed evidence of a shared causal variant for fibrosis across five organs. We also identified two novel fibrotic associations, one implicating alternative splicing of TFCP2L1 for urinary fibrosis and another implicating a missense variant in FAM180A for intestinal-pancreatic fibrosis. We observed significant genetic correlations for all organs, particularly for liver and skeletal fibrosis. Conclusion: We found evidence of shared genetic associations for fibrosis across organs, both at individual genetic loci and genome-wide. This highlights specific genes that may contribute to fibrosis across organs and diseases, which may facilitate the development of new therapies.
Mackie, K.; Kemp, H.; Gunnell, A.; Studd, J. B.; Went, M.; Law, P.; Tomczyk, K.; Sevgi, S.; Lu, Y.; Orr, N.; Houlston, R. S.; Johnson, N.; Fletcher, O.; Haider, S.
Genome-wide association studies (GWAS), combined with fine-mapping, have identified 196 independent signals associated with breast cancer risk. Deciphering the functional basis of these associations can inform our understanding of the biology and aetiology of breast cancer. Decoding GWAS risk associations is challenging due to linkage disequilibrium between variants and because most variants map to non-coding regions, influencing breast cancer risk via cis-regulatory mechanisms that modulate the expression of target genes. To identify the functional variants driving breast cancer risk associations, we carried out a lentivirus-based massively parallel reporter assay (lentiMPRA) to screen 5,116 credible causal variants across these signals. We identified 709 variants, mapping to 140 risk regions, that are associated with significant variation between REF and ALT alleles. A follow-up investigation at 14q32.11 revealed that rs7153397 may impact expression of CCDC88C to influence both breast cancer risk and prognosis. These findings provide a prioritised set of functional variants for downstream analyses, advancing our understanding of breast cancer risk mechanisms.
Hernandez Beeftink, T.; Donoghue, L. J.; Izquierdo, A.; Moss, S. T.; Chin, D.; Guillen-Guio, B.; Bhatti, K. F.; Biddie, S.; Shrine, N.; Packer, R.; Adegunsoye, A.; Booth, H. L.; Fahy, W. A.; Fingerlin, T. E.; Hall, I. P.; Hart, S. P.; Hill, M. R.; Hirani, N.; Kaminski, N.; Lopez-Jimenez, E.; Lorenzo-Salazar, J. M.; Ma, S.-F.; McAnulty, R. J.; McCarthy, M. I.; Stockwell, A. D.; Maher, T. M.; Millar, A. B.; Molyneaux, P. L.; Molina-Molina, M.; Navaratnam, V.; Neighbors, M.; Oldham, J. M.; Parfrey, H.; Saini, G.; Sayers, I.; Sheng, X. R.; Strek, M. E.; Stewart, I.; Tobin, M. D.; Whyte, M. K.; Zha
Rationale: Idiopathic pulmonary fibrosis (IPF) is a rare, chronic, progressive lung disease with high mortality and few treatment options. Using an additive genetic model, genome-wide association studies (GWAS) have identified multiple risk loci highlighting new genes and pathways of interest. Since IPF risk could also be influenced by non-additive effects, we hypothesised that association analyses using alternative genetic models may provide additional mechanistic insight. Objectives: To perform GWAS of IPF susceptibility to detect associations where the underlying effects are consistent with recessive or dominant genetic models. Methods: We performed GWAS of IPF susceptibility, with logistic regression assuming dominant or recessive genetic models, including 5,159 IPF cases, from clinically-curated sources, and 27,459 controls. We functionally annotated independent signals and performed variant-to-gene mapping, applying fine-mapping to define potentially causal variants and genes. We assessed differential expression levels of genes of interest in publicly available single-cell RNA-seq data and in primary cells derived from IPF donors and controls. Main Results: We identified five genome-wide significant signals, under a recessive model, that had not been reported previously. These included exonic variants in the cell-cycle gene Polyamine-Modulated Factor 1 (PMF1) and in Epsin 3 (EPN3). We also observed evidence of increased PMF1 expression in airway basal cells of IPF patients compared to controls. Conclusions: Using alternative genetic models in IPF susceptibility GWAS identified new signals and genes, providing new insights into IPF pathogenesis and potential future therapies.
Iacovazzo, D.; Begalli, F.; Suleyman, O.; Doleschall, M.; Alevizaki, M.; Ashelford, K. E.; Awad Mahmoud, S.; Barlier, A.; Barry, S.; Brain, C.; Cabrera, C. P.; Castinetti, F.; Chiloiro, S.; Colclough, K.; Csabi, A.; Druce, M. R.; Dutta, P.; Fatih, J. M.; Foulkes, W. D.; Gandhi, M.; Grochowski, C. M.; Hall, C. L.; Jarzab, B.; Klein, K. O.; Krajewska, J.; Kurzawinski, T. R.; Lamers, S.; Lugli, F.; Magid, K.; Margraf, R.; Martin, C. S.; Mathiesen, J. S.; Mihai, R.; Morrison, P. J.; Mozere, M.; Oczko-Wojciechowska, M.; Owens, M.; Ozretic, L.; Patocs, A.; Piacentini, S.; Punetha, J.; Romanet, P.; S
While most individuals with familial medullary thyroid carcinoma (fMTC) carry RET mutations, in some instances the causative mutations remain unknown. We studied two related families with RET-negative fMTC in 21 affected individuals through linkage analysis, exome/genome sequencing, and high-density array comparative genomic hybridization. We identified a novel heterozygous 40kb intragenic SLC30A9 deletion which segregated with the disease in all affected individuals. The mutant transcript escaped nonsense-mediated decay and resulted in the production of N-terminally truncated proteins via translation reinitiation from in-frame AUG codons located downstream of the deletion. These proteins showed increased stability and their expression in an MTC cell line increased cell proliferation and clonogenic capacity, supporting an oncogenic role. These findings expand the genetic background of fMTC beyond RET mutations and implicate translation reinitiation in the etiology of cancer susceptibility syndromes secondary to structural genomic variants.
Morrow, E. H.; Harper, J. A.
Trade-offs form a key constraint in many aspects of organismal evolution, though they may help maintain genetic diversity. Late-onset Alzheimer's disease (LOAD) shows features in common with the male-female health survival paradox: females suffer from higher prevalence and risk, as well as faster rates of cognitive decline, while males suffer higher mortality. Though antagonistic pleiotropy could explain the tendency of LOAD to appear late in life, the sexually dimorphic profile suggests a role for intralocus sexual conflict. Using published data on sex-specific genetic associations with LOAD risk, we found evidence for a number of sexually antagonistic loci, where alleles with net negative effects that reduce male risk but increase female risk are more common than alleles with reversed effects. Multiple lines of evidence also suggest there is an inverse relationship between cancer and LOAD risk. The combined effect of sexual antagonism and antagonistic pleiotropy could explain the persistence of alleles that increase LOAD risk in post-reproductive females, if they also reduce cancer risk in males. This framework could be applied to other female-biased late-life conditions, and our results may be useful in informing polygenic risk scores or therapies where genotype-by-sex effects may result in undesired outcomes for one particular sex.
O'Brien, A.; Kong, H.; Patel, H.; Ho, M.; Patel, M. B.; Zhong, J.; Xu, M.; Papenberg, B. W.; Connelly, K. E.; Collins, I.; Hennessey, R.; Thakur, R.; Sowards, H.; Funderburk, K.; Luong, T.; Florez-Vargas, O.; Myers, T.; Jermusyk, A.; Gorman, B.; Luo, W.; Jones, K.; Das, S.; Lan, Q.; Rothman, N.; McKay, J. D.; Hung, R. J.; Amos, C. I.; Iles, M. M.; Koutros, S.; Landi, M. T.; Law, M. H.; Stolzenberg-Solomon, R. Z.; Wolpin, B.; Hassan, M.; Klein, A. P.; Antwi, S. O.; Orr, N.; Chanock, S. J.; Lindstroem, S.; Hoskins, J. W.; Stern, M.-H.; Andresson, T.; Shi, J.; Prokunina-Olsson, L.; Choi, J.; Brow
Chromosome 5p15.33 harbors several independent association signals which demonstrate antagonistic pleiotropy across cancer types, with causal mechanisms largely unresolved. To identify functional variants and enhancer elements at this locus, we performed statistical fine-mapping followed by massively parallel reporter assays (MPRA) and proliferation-based CRISPRi screens. This approach identified eight multi-cancer functional variants (MCFVs) across three GWAS signals. Targeting rs421629 (part of the CLPTM1L signal marked by rs465498) with CRISPRi revealed opposing effects on TERT expression in pancreatic versus lung cancer cells, consistent with the antagonistic pleiotropy observed for this signal. Furthermore, CRISPRi nominated an intronic CLPTM1L variable number tandem repeat (VNTR) as a potent enhancer. Long-read sequencing established VNTR polymorphisms as potential causal variants for the rs465498 signal. We showed that Hippo-pathway transcription factors mediate VNTR enhancer activity in lung and pancreatic cancer cells. Together, these findings indicate that cancer susceptibility at 5p15.33 may be mediated by both SNPs and VNTRs and provide an integrated framework for resolving complex pleiotropic loci.
Lalaurie, C.; Liu, L.; Khan, A.; Wang, C.; Rich, S.; Barr, R. G.; Bernstein, E.; Kiryluk, K.; McDonnell, T. C. R.; Luo, Y.
Anti-β2-glycoprotein I (anti-β2GPI) antibodies are central to the pathogenesis of antiphospholipid syndrome (APS), an autoimmune disease characterized by a strong predisposition to venous thromboembolism (VTE). In this study, we conducted a multi-ancestry genome-wide association study (GWAS) of quantitative total anti-β2GPI levels in 5,969 participants enrolled in the Multi-Ethnic Study of Atherosclerosis (MESA) and identified a genome-wide significant association at the APOH locus. Paradoxically, genetically determined increases in anti-β2GPI levels at this locus were associated with lower VTE risk. Fine-mapping and functional genomics prioritized the missense variant rs1801690 (W335S) in β2GPI (apolipoprotein H, APOH) as the most likely causal variant. This variant has an allele frequency of 5-6% in European and East Asian ancestries but only 1% in African ancestries. Integrating prior experimental studies, molecular dynamics simulations and structure-based epitope prediction, we propose a dual-effect mechanism whereby W335S reduces thrombotic risk by disrupting phospholipid binding in Domain V, yet increases autoantibody production through conformational changes that enhance epitope exposure in Domains I and II. These findings mechanistically uncouple autoantibody formation from thrombotic risk in carriers of the W335S variant, and suggest that APOH genotype may represent a clinically relevant genetic biomarker with potential utility for thrombotic risk stratification in anti-β2GPI-positive individuals.
Gupta, A.; Muthuswami, M.
Clinical interpretation of breast cancer sequencing is constrained not by a lack of data but by the absence of an organising framework that translates constellations of co-occurring mutations and copy-number alterations into tumour-level biology with prognostic and therapeutic meaning. This challenge is exemplified by PIK3CA, a clinically actionable alteration often treated as a single-label biomarker despite context-dependent associations with outcome. We analysed >5,000 breast tumours across multiple cohorts using integrated multi-omics (somatic mutations, copy-number, transcriptomic, proteomic and phosphoproteomic profiles) and quantified the directionality of downstream molecular consequences of recurrent alterations relative to TP53-associated trends to infer dominant tumour programmes. This revealed a robust functional organisation comprising (i) a canonical proliferative/replicative programme, enriched for cell-cycle, DNA replication and E2F signalling, and encompassing TP53 mutations and most recurrent CNAs, and (ii) a non-canonical signalling/cell-state programme marked by recurrent mutations including PIK3CA, CDH1, GATA3, MAP3K1 and AKT1, with opposing transcriptomic/proteomic directionality, comparatively lower proliferative output and a systematic tendency towards mutual exclusivity with TP53, consistent with alternative evolutionary routes. 
To operationalise these findings for clinical use, we developed T-OMICS (Tiered OMICS Classification System), which layers complementary readouts to deliver a single interpretable tumour profile: Tier 1 provides a continuous genomic-risk backbone via a DNA-anchored prognostic RNA signature capturing canonical proliferative/replicative output; Tier 2 assigns programme identity based on the dominant genomic context; Tier 3 quantifies within-programme activity along a continuum; and Tier 4 overlays non-redundant modifier mutations that refine phenotype, vulnerabilities and resistance liabilities, supported by orthogonal proteomic/phosphoproteomic pathway signals. In ER+/HER2- disease, T-OMICS resolves the prognostic ambiguity of PIK3CA by showing that "PIK3CA-mutant" is not a single biological entity: in a predominant low-genomic-score context, PIK3CA aligns with buffered luminal biology and favourable outcomes, whereas in high-score contexts--conditioned by TP53 background and modifier events--PIK3CA can mark adverse biology with distinct dependencies not captured by proliferation-centric readouts; notably, low-score PIK3CA tumours with CDH1 co-mutation shift to significantly worse outcomes. Together, these results establish a programme- and state-aware framework that converts sequencing reports into clinically legible tumour biology to support risk calibration, therapeutic prioritisation and evolution-aware sampling decisions from early-stage through metastatic ER+/HER2- breast cancer. Lay Summary: Breast cancer tumours often carry many genetic changes at the same time. While modern sequencing can identify these changes in detail, the results are frequently presented as long lists of mutations and DNA alterations that are difficult to interpret in terms of how a tumour behaves or how it should be treated. 
A well-known example is the PIK3CA gene: although it can be targeted with specific drugs, studies have reported mixed results on whether PIK3CA mutations are associated with better or worse outcomes, making it challenging to use this information confidently in clinical care. To address this problem, we analysed genomic (DNA-wide), RNA, and protein data from more than 5,000 breast tumours. We found that many common genomic changes cluster into two main biological "programmes" that reflect distinct ways tumours grow and survive. One programme is driven by rapid cell division and DNA replication and includes TP53 mutations and many common DNA copy-number changes; tumours following this programme tend to be more aggressive. The second programme is less focused on rapid growth and is defined by mutations such as PIK3CA, CDH1, GATA3, MAP3K1, and AKT1, which influence signalling and cell identity rather than directly accelerating proliferation. These programmes reflect broader tumour behaviours rather than the effects of single genes. Importantly, mutations in the second programme are usually not found alongside TP53 mutations, suggesting that breast cancers can develop through distinct biological routes--with some tumours following an alternative pathway (not overtly proliferation-dependent) that shapes their behaviour and may influence which treatments are most appropriate. Based on these findings, we developed a practical classification system, T-OMICS, for ER-positive, HER2-negative breast cancer. T-OMICS summarises which biological programme a tumour follows, how active or aggressive it is within that programme, and whether additional mutations are present that may influence treatment response or resistance. 
Using this framework, we show that PIK3CA mutations most often occur in a biologically buffered context associated with more favourable outcomes, but when they occur in more aggressive tumours--shaped by other key genetic changes--they can signal a higher-risk disease with different treatment needs. These findings indicate that treatment decisions should be based on the tumour's overall biological pattern, not just the presence of a single mutation. By placing sequencing results in this broader context, T-OMICS supports more accurate risk assessment, better treatment planning, and more informed decisions about when to intensify therapy, from early-stage through advanced breast cancer. [Figure 1: Graphical Summary]
Kovanda, A.; Hodzic, A.; Kotnik, U.; Visnjar, T.; Podgrajsek, R.; Andjelic, A.; Jaklic, H.; Maver, A.; Lovrecic, L.; Peterlin, B.
STUDY QUESTION[Do structural genomic variants that can be identified by optical genome mapping contribute to male infertility?] SUMMARY ANSWER[By using optical genome mapping we can identify several types of structural variants, both known and new, that may contribute to male infertility.] WHAT IS KNOWN ALREADY[Traditional approaches such as karyotyping, CFTR and chromosome Y microdeletion testing are successful in explaining clinical findings in ~30% of male infertility (MI) patients, leaving the rest without a genetic diagnosis. Recent research suggests at least 265 genes may play a role in male fertility. While the assessment of the roles of copy number variants and single nucleotide variants in monogenic forms of disease in these genes is underway, much less is known about structural variants.] STUDY DESIGN, SIZE, DURATION[We performed a longitudinal case/control study on a total of 220 individuals: 88 patients with male infertility, negative for cytogenetic abnormalities on karyotyping and on molecular testing for chrY microdeletions and CFTR gene variants, and 132 healthy male individuals that underwent optical genome mapping for other reasons. Exclusion criteria for the control cohort were low sperm quality and/or inclusion in IVF procedures. The study was approved by the National Medical Ethics Committee of the Republic of Slovenia (reference number: 0120-213/2022/6). Optical genome mapping was performed from an aliquot of whole blood collected for routine testing purposes at the Clinical Institute of Genomic Medicine (CIGM), UMC Ljubljana from January 2023 to November 2024.] PARTICIPANTS/MATERIALS, SETTING, METHODS[We examined structural variants in 220 participants by using optical genome mapping, which was performed with DLE-1 SP-G2 chemistry and the Saphyr instrument. 
The de novo assembly and Variant Annotation Pipeline were executed on Bionano Solve3.7_20221013_25 while reporting and direct visualization of structural variants was done on Bionano Access 1.7.2. All obtained variants were filtered using the Bionano Access software and in-house generated gene/regions of interest panel bed files. The first filter was applied to include variants below a population frequency of 10% and overlapping the regions of interest. Subsequently, all variants occurring with frequency 0% in the internal manufacturer variant dataset were manually evaluated for possible involvement of the overlapping genes or regions in biological processes involved in MI. The male infertility cohort also underwent research whole exome analyses as previously reported. All results of optical genome mapping were confirmed by an appropriate alternative method where available.] MAIN RESULTS AND THE ROLE OF CHANCE[We show that the overall number of structural variants in MI patients does not differ from that of healthy individuals. By looking in detail at genes and regions associated with MI, we identified 21 rare variants absent from controls in 25.0% of MI patients, of which five were likely causative, and two would be missed by using traditional approaches. These variants include inversions, duplications, amplifications, deletions (e.g. SPAG1), and insertions/expansions (e.g. DMPK) that were validated using additional methods. While the remaining SVs cannot currently be classified as pathogenic according to existing criteria, they open a new avenue in genetic research of MI.] 
LARGE SCALE DATA[Variants reported in this study were deposited into ClinVar under accession number SUB15650956 (https://www.ncbi.nlm.nih.gov/clinvar/)] LIMITATIONS, REASONS FOR CAUTION[Technical limitations of optical genome mapping include the lack of DLE-1 labelling of centromeric and telomeric regions, the inability to detect Robertsonian translocations, the unclear exact location of smaller structural variants located between the DLE-1 labels, and unclear boundaries in case of their location in segmentally duplicated regions (this limitation is shared with other methods). The ACMG criteria of rarity are also hard to apply, as the fertility status of the individuals in healthy population databases such as gnomAD and DGV is unknown. Similarly, gene-associated phenotype and the proposed inheritance model both need to be considered as parts of the ACMG criteria, but for many candidate genes associated with MI, no model of inheritance has yet been proposed.] WIDER IMPLICATIONS OF THE FINDINGS[Currently, with the established diagnostic approaches we are able to resolve ~30% of male infertility cases, with ~70% of patients remaining undiagnosed. The significance of our work is in showing that rare structural variants can be identified in MI by using optical genome mapping, opening new avenues of research into the genetics of this important contributor to human fertility.] STUDY FUNDING/COMPETING INTEREST(S)[All authors declare having no conflict of interest in regard to this research. This work was funded by the Slovenian Research and Innovation Agency (ARIS) Programme grant P3-0326: Gynecology and Reproduction: Genomics for personalized medicine] Lay summary: Male infertility affects about 5% of adult males and has complex causes, including genetic ones, such as mutations in the CFTR gene, small deletions on chromosome Y, and balanced translocations, but currently we can only find a genetic cause in ~30% of patients. 
This means ~70% of cases remain undiagnosed, but they too may have an as yet unknown genetic cause. Indeed, research so far has proposed at least 265 genes as playing a role in male fertility. In these genes, there has so far been limited research on single nucleotide variants and copy number variants, and many structural variants are not visible using the methods commonly used in clinical genetic testing. Therefore, apart from chromosome Y microdeletions and chromosomal numerical and structural anomalies, such as balanced translocations, the role of smaller structural variants in male infertility is unknown, but based on what we know from other diseases, they may also play a role in male infertility. Optical genome mapping is a novel method for the detection of structural variants, such as balanced and unbalanced translocations, insertions, duplications, deletions, and complex structural rearrangements in a wide range of sizes. By using optical genome mapping to test a cohort of 88 infertile men and 132 healthy controls, we aimed to provide the first insights into the range of structural variants (SVs) that may be associated with male infertility (MI). We found that the overall number of structural variants in MI patients was not significantly different from that of the control group. However, by looking at genes and regions associated with MI, we can find rare structural variants that are absent from controls in 25.0% of MI patients. These variants include inversions, duplications, amplifications, deletions (e.g. deletion in SPAG1), and insertions/expansions (e.g. in DMPK) that were validated using additional methods. Five of these variants (5.6%) were likely causative, and two would be missed by traditional approaches. While the remaining SVs cannot currently be classified as pathogenic according to existing criteria, they open a new avenue in genetic research of MI.
Gandhi, N. R.; Fernandes Gyorfy, M.; Paradkar, M.; Jennet Mofokeng, N.; Figueiredo, M. C.; Prakash, S.; Prudhula Devalraju, K.; Hui, Q.; Willis, F.; Mave, V.; Andrade, B. B.; Moloantoa, T.; Kumar Neela, V. S.; Campbell, A.; Liu, C.; Young, A.; Cordeiro-Santos, M.; Gaikwad, S.; Karyakarte, R. P.; Rolla, V. C.; Kritski, A. L.; Collins, J. M.; Shah, N. S.; Brust, J. C. M.; Lakshmi Valluri, V.; Sarkar, S.; Sterling, T. R.; Martinson, N. A.; Gupta, A.; Sun, Y. V.
Understanding host susceptibility to Mycobacterium tuberculosis (Mtb) is critical for the development of new vaccines. Certain individuals "resist" becoming infected with Mtb despite intensive exposure; however, it is unknown whether there is a genetic basis for "resistance" to Mtb infection across populations. Here we conducted a genome-wide association study (GWAS) of resistance to Mtb infection by carefully characterizing exposure to TB patients among 4,058 close contacts in India, Brazil, and South Africa. In total, 476 (12%) "resisters" remained free of Mtb infection despite substantial exposure to highly infectious TB patients. GWAS identified a novel chromosome 13 locus (rs1295104126) associated with resistance in the multi-ancestry meta-analysis. Comparing Mtb-infected contacts with all uninfected contacts, irrespective of exposure, yielded a different locus on chromosome 6 (rs28752534), near the HLA-II region. These findings demonstrate a common genetic basis for resistance to Mtb infection across multi-ancestral cohorts, with potential to elucidate novel mechanisms of protection from Mtb infection.
Liu, S.; Szabo, A.; Zarouchlioti, C.; Bhattacharyya, N.; Nguyen, Q.; Abreu Costa, M.; Luben, R.; Dudakova, L.; Skalicka, P.; Horak, M.; Khawaja, A.; Pontikos, N.; Muthusamy, K.; Tuft, S.; Liskova, P.; Davidson, A.
Purpose: Fuchs endothelial corneal dystrophy (FECD) is a common corneal disease and a leading indication for endothelial keratoplasty (EK). Although CTG18.1 repeat expansion is a major genetic risk factor, the contribution of polygenic background to disease progression remains unclear. We evaluated whether combining CTG18.1 expansion status with a FECD-specific polygenic risk score (PRS) enables genomic prediction of progression to EK. Methods: We retrospectively analysed 589 individuals with FECD from two European centers, with replication in an independent cohort of 185 individuals. Associations of CTG18.1 expansion (≥50 repeats) and PRS with time to EK were evaluated using Cox models adjusted for sex and ancestry. Results: Expansion-positive status was associated with earlier EK (HR 2.30; 95% CI 1.62-3.26; P<.001). Addition of PRS improved prediction (C-index 0.614 vs 0.602; P=.014). Each 1-SD increase in PRS was associated with earlier EK (HR 1.16; 95% CI 1.03-1.30; P=.015), with replication in the validation cohort (HR 1.42; 95% CI 1.15-1.75; P=.001). Conclusion: Integration of monogenic and polygenic risk enables genomic prediction of FECD progression, supporting clinical genomic risk stratification to inform individualized monitoring and timing of intervention.
Chu, R.; Sun, A.; Qu, J.; Lu, M.
Biological age estimators quantify aging-related variation but provide limited insight into organ-specific aging processes. The retina enables non-invasive visualization of microvascular and neural structures and has emerged as a promising modality for biological age prediction. However, existing retinal aging models typically produce unidimensional age estimates with limited interpretability. Here we develop a deep learning framework based on a large-scale vision foundation model to estimate retinal biological age from fundus images and to characterize the physiological heterogeneity underlying retinal aging. Using a reference cohort of 56,019 relatively healthy individuals, the model achieved a Mean Absolute Error of 2.48 years in age prediction. Analysis of age deviations in a real-world clinical cohort (n = 46,369) revealed non-linear associations with cardiometabolic risk and population heterogeneity in aging patterns. Integrating multidimensional physiological profiling, feature attribution and unsupervised analysis, we identified distinct retinal aging signatures associated with systemic inflammation and hemodynamic variation. To further characterize age-related deviations, we introduced a residual learning framework that decomposes retinal aging signals into a normative age-related component and additional components associated with physiological variation, achieving a Mean Absolute Error of 1.80 years on the independent healthy test set. This approach provides an interpretable representation of retinal aging and a framework for studying organ-level aging processes and their relationship to systemic health using large-scale imaging data.
Ding, Y.; Sayaman, R. W.; Wolf, D.; Mortimer, J.; Mao, A.; Fejerman, L.; Gruber, S. B.; Neuhausen, S. L.; Ziv, E.
Somatic mutations and the tumor immune microenvironment in breast tumors are important predictors of treatment response and survival, yet data for Hispanic/Latina (H/L) women are limited. Here we analyzed whole exome sequencing data from tumor/normal pairs and RNAseq data from 748 H/L women and 388 non-Hispanic White (NHW) women. Overall, the somatic profiles in tumors from H/L women were similar to NHW women. However, somatic mutations in genome organizer CTCF were significantly more common in H/L women. We also found that tumor microenvironment immune ecotypes CE9 and CE10, characterized by increased lymphocyte infiltration and more favorable prognosis, were more common among women with higher Indigenous American ancestry. Finally, we found that a germline APOBEC3A/B copy-number deletion was more prevalent in H/L than in NHW and was associated with the COSMIC APOBEC mutational signatures and with CE10 ecotype. Overall, these results suggest that ancestry differences may provide insights into specific mutation and immune profiles.
Du, J.; Horimoto, A. R. V. R.; Best, L. G.; Zhang, Y.; Cole, S.; Umans, J. G.; Franceschini, N.; Sun, Q.
Polygenic scores (PGS) show promise for disease risk stratification but suffer from limited portability across populations. American Indians face a disproportionate burden of cardiovascular disease yet remain significantly underrepresented in genomic research, limiting equitable access to precision medicine. Here, we evaluate whether integrating specific lifestyle and clinical context variables with PGS enhances risk prediction for cardiometabolic traits in 424,622 European participants from UK Biobank (UKB) and 3,157 American Indian participants from the Strong Heart Study (SHS). By comparing genetics-only models to full models incorporating context variables and gene-context interactions across blood pressure traits, coronary heart disease (CHD), and stroke, we found that the integration of context variables significantly improved prediction accuracy in both cohorts. Notably, for American Indian participants, the new model incorporating context and genetic interactions significantly improved model discrimination for CHD compared to an established clinical risk model. These findings suggest that modeling the interplay between inherited risk and modifiable factors can recover predictive power lost due to imperfect PGS transferability, offering a viable pathway toward more equitable and effective precision medicine for underrepresented populations.
Li, Y.; Liu, X.; Mao, P.; Zhou, T.; Fan, X.; Xie, G.; Ji, Y.; Wang, W.; Han, G.; Jiang, J.; Zhang, C.; Yang, J.
Pulmonary hypertension (PH) is a progressive condition characterized by increased pulmonary arterial pressure. Endothelial cell dysfunction is an important characteristic of PH. Recently, capillary endothelial cells (Caps), including aerocytes (aCaps) and general capillary cells (gCaps), have been detected in developing lungs, but their role in PH and the underlying regulatory mechanisms remain poorly understood. The goal of this study was to identify changes in Caps and their effects on hypertensive pulmonary circulation. We set up a Capillary Alveoli Micro-physiological System (CAMS) incorporating human pluripotent stem cell (hPSC)-aCaps to show loss of Cap connection under dynamically cultured hypoxic conditions. We employed single-cell RNA sequencing (scRNA-seq) and immunofluorescence to demonstrate impaired gCaps differentiation with increased expression of the cell membrane receptor CD93 in PH patients and a Sugen 5416/hypoxia (SuHx) rat model. Conditional knockdown or lentiviral overexpression of CD93 alleviated the pathology observed in SuHx mice. We also revealed by ChIP assay that CD93 overexpression upregulated SMAD2/3 to repress Apelin (APLN) expression. Finally, supplementation with an APLNR agonist in the PH rat model promoted gCaps-to-aCaps differentiation and improved haemodynamic indices. Overall, our results highlight the potential of promoting capillary cell differentiation with a G protein-biased APLNR agonist as a therapeutic strategy for pulmonary vascular disease.
Zhao, X.; Niederhauser, T.; Balazs, Z.; Wicki, A.; Fan, B.; Krauthammer, M.
Guideline-based recommendations for metastatic lines of therapy (mLoTs), especially second lines and beyond, are comparatively sparse due to challenges in later-line treatment efficacy quantification. Scalable real-world evidence that captures the interaction between treatment and disease progression is therefore especially valuable, as regimens become increasingly individualized, confounding intensifies, and progression is rarely recorded as a structured EHR endpoint. We present a framework to (i) reconstruct clinically coherent mLoTs from longitudinal EHR using radiology-anchored progression evidence and (ii) generate individualized progression-free survival (PFS) estimates from a line-start multimodal snapshot in a highly heterogeneous cohort. In 2,881 patients contributing 8,791 metastatic mLoTs, the selected model shows strong discrimination over a 2-year horizon (Antolini's C = 0.680 ± 0.006; cumulative/dynamic AUC at 1 year = 0.824 ± 0.006). Predicted risk strata closely track Kaplan-Meier trends across line number and tumor subtypes, enabling calibrated risk stratification even in smaller sub-cohorts. Model prediction primarily relies on clinically plausible signals of recent metastatic burden and tumor markers, with limited dependence on surveillance cadence or subtype labels, and is robust to missingness. Together, this framework supports scalable evidence generation and interpretable, calibrated prognostication to inform risk assessment and care planning in heterogeneous metastatic practice.